Large-scale integrative analysis of juvenile idiopathic arthritis for new insight into its pathogenesis

Background Juvenile idiopathic arthritis (JIA) is one of the most prevalent rheumatic disorders in children and is classified as an autoimmune disease (AID). While a robust genetic contribution to JIA etiology has been established, the exact pathogenesis remains unclear. Methods To prioritize biologically interpretable susceptibility genes and proteins for JIA, we conducted transcriptome-wide and proteome-wide association studies (TWAS/PWAS). Then, to understand the genetic architecture of JIA, we systematically analyzed single-nucleotide polymorphism (SNP)-based heritability, a signature of natural selection, and polygenicity. Next, we conducted HLA typing using multi-ethnicity RNA sequencing data. Additionally, we examined the T cell receptor (TCR) repertoire at a single-cell level to explore the potential links between immunity and JIA risk. Results We have identified 19 TWAS genes and two PWAS proteins associated with JIA risks. Furthermore, we observe that the heritability and cell type enrichment analysis of JIA are enriched in T lymphocytes and HLA regions and that JIA shows higher polygenicity compared to other AIDs. In multi-ancestry HLA typing, B*45:01 is more prevalent in African JIA patients than in European JIA patients, whereas DQA1*01:01, DQA1*03:01, and DRB1*04:01 exhibit a higher frequency in European JIA patients. Using single-cell immune repertoire analysis, we identify clonally expanded T cell subpopulations in JIA patients, including CXCL13+BHLHE40+ TH cells which are significantly associated with JIA risks. Conclusion Our findings shed new light on the pathogenesis of JIA and provide a strong foundation for future mechanistic studies aimed at uncovering the molecular drivers of JIA. Supplementary Information The online version contains supplementary material available at 10.1186/s13075-024-03280-2.


Introduction
Juvenile idiopathic arthritis (JIA) is one of the most common rheumatic diseases in children; it is mainly regarded as an autoimmune disease (AID) whose clinical manifestations include persistent limping, painful joints, stiffness, and inflammation [1,2].The exact JIA pathogenesis remains unclear; however, it has been demonstrated that there is a strong genetic contribution to the etiology of JIA.The single-nucleotide polymorphism (SNP)-based heritability for JIA was estimated to be 73%, among the most highly heritable pediatric AIDs.Even though genome-wide association studies (GWASs) have identified many risk variants of JIA, most are located in non-coding regions, making it difficult to interpret their functional consequences [3].By integrating GWASs with expression quantitative trait loci (eQTL), transcriptomewide association studies (TWASs) provide a powerful approach to prioritize susceptibility genes affected by risk variants [4].
Investigating the genetic architecture of complex diseases is essential for understanding the genetic basis of phenotypic variations and evolutions [5].For complex diseases, natural selection plays an essential role in forming the genetic architecture and provides valuable insights into biological mechanisms [6].In AIDs, the signature of negative selection is significant because it sheds light on how the human immune system has evolved to defend against pathogens while avoiding harmful responses against self-tissues.
Moreover, given that JIA is considered an AID, it is also important to further understand the underlying mechanisms of immune responses affecting the JIA etiologies.Human leukocyte antigen (HLA) molecules presenting peptide antigens to receptors on T lymphocytes are highly polymorphic at their peptide-binding site and mediate the adaptive immune responses [5].Moreover, T cell receptors (TCRs) interacting with HLA molecules are essential components of adaptive immune responses [7].Through the somatic recombination of TCRs, self-reactive T cells can be produced, and the recognition of selfantigens can affect the development of AIDs [8].Even though there is some evidence that T cells are involved in JIA etiologies, their contributions to the pathogenesis of AIDs including JIA have not been completely revealed [9,10].
Herein, we prioritized susceptibility genes/proteins for JIA by conducting a multi-tissue TWAS and proteomewide association study (PWAS) by integrating JIA GWAS summary statistics data (n = 3,305 cases and n = 9,196 controls) with reference eQTL/protein QTL (pQTL) panels (n = 14,037).We identified 19 genes and two proteins associated with JIA risk using TWAS and PWAS.We then estimated disease heritability and signatures of natural selection to understand the genetic architecture and to prioritize the most relevant tissue and cell types of JIA.We observed that the heritability of JIA was enriched in T lymphocytes and HLA regions and that JIA showed higher polygenicity than other AIDs.Next, HLA typing was conducted using multi-ancestry RNA sequencing (RNA-seq) data, and TCR repertoire analysis was performed at a single-cell level to investigate the associations between immunity and JIA risks.We found some HLA types, such as B*45:01, DQA1*01:01, DQA1*03:01, and DRB1*04:01, which were more frequent in the European or African JIA patients.In addition, we identified clonally expanded T cell subpopulations in JIA patients, among which CXCL13 + BHLHE40 + T cells were significantly associated with JIA risks at the single-cell level.We believe that these findings could provide new insights into understanding the underlying mechanisms of JIA etiology.

Transcriptome-wide association study and proteome-wide association study
To identify susceptibility genes associated with the pathogenesis of JIA, a TWAS was performed using functional summary-based imputation (FUSION) [4].Briefly, TWAS identifies risk genes associated with the target disease by integrating GWAS summary statistics data with the reference eQTL data of specific tissues, considering linkage disequilibrium (LD) structures.Using the eQTL panels and LD information, the cis-genetic components of gene expression are imputed from the JIA summary statistics data.Then, the predicted gene expression is used for association tests with JIA risks to identify significant associations between the gene expression and the disease.Twelve connective tissue panels from the Genotype-Tissue Expression project v7 (GTEx v7; n = 449), the Metabolic Syndrome in Men study (METSIM; n = 563), the Netherlands Twin Registry (NTR; n = 1247), and the Young Finns Study (YFS; n = 1264) were selected as expression weights for transcriptomic imputation (Table S1) [14][15][16][17][18][19].For each panel, predictive models for gene expression were trained using cis-regulated genes by SNPs within ±500kb of the transcription start site and are significant for heritability (cis-h 2 ) with P < 0.01.The European LD information from the 1000 Genomes project accounted for the LD regions [20].Due to the complex LD patterns, we excluded the TWAS associations from major histocompatibility complex (MHC) regions [21].The significance threshold of TWAS associations was corrected with the Bonferroni correction (P TWAS < 7.55 × 10 −07 , 0.05 /66,196).
We conducted a PWAS using the pQTL data from the INTERVAL (n = 3301) [22] and Atherosclerosis Risk in Communities (ARIC; n = 7213) study [23] of European individuals.For each pQTL panel, predictive models were trained using cis-regulatory proteins with SNPs within ±500 kb of the same transcriptional start site, and the significance levels of cis-h2 in INTERVAL (n = 1031 models) and ARIC (n = 1309 models) were 0.05 and 0.01, respectively.The predictive models in the INTERVAL study were fitted using the sum of single effects (SuSiE) [24], and the elastic net was used in the ARIC study.As with TWAS, the PWAS associations in the MHC region were excluded.The same significance thresholds with the Bonferroni correction were used for the PWAS associations as the TWAS associations (P PWAS < 2.16 × 10 −05 , 0.05/2311).

Fine-mapping of TWAS and PWAS associations
We performed fine-mapping of causal gene sets (FOCUS) to identify TWAS/PWAS associations responsible for the disease, estimating gene-trait associations at the GWAS risk regions while considering LD structures and controlling for pleiotropic SNP effects [25].FOCUS calculates the posterior inclusion probability (PIP) per the TWAS/ PWAS association and suggests credible gene sets containing susceptibility genes at a 90% confidence level.We applied FOCUS to the specific loci for each panel where the significant TWAS/PWAS associations (P TWAS < 7.55 × 10 −07 and P PWAS < 2.16 × 10 −05 ) identified by FUSION were detected.The weight database for FOCUS was generated from the GTEx and INTERVAL FUSION weights.

Pathway enrichment analysis
We conducted a TWAS-based Gene Set Enrichment Analysis (TWAS-GSEA) using the TWAS results from individual tissue panels [26].The TWAS associations with a panel and the 12 eQTL panels containing information on the position of genes were used as inputs for this analysis.We retrieved curated gene sets from canonical pathways (CP), WikiPathways, Kyoto Encyclopedia of Genes and Genomes (KEGG), BioCarta, Reactome, and Pathway Interaction Database (PID) from the Molecular Signatures Database (MSigDB v7.2) [27][28][29][30][31][32].

Genetic correlation analyses with other traits
To estimate genetic correlations at the genome-wide level between JIA and other AID-like (n = 10) and non-AID-like (n = 15) traits, an LD score regression was performed using the GWAS summary statistics data of JIA and 25 traits using LDSC [13,35].Sumstats-formatted publicly available summary statistics (PASS) data of 22 traits, excluding type 1 diabetes (T1D), were retrieved from LD Hub [36].The summary statistics data of T1D were obtained from the GWAS catalog (catalog ID: GCST005536) because T1D PASS data were not reported in the LD hub, although the comorbidity of JIA and T1D has previously been reported [11,37,38].Summary statistics data of AID ALL and AID SURE traits, the remaining two traits, reported in UK Biobank (UKBB) were also used.The LD score data for the European samples were used for this analysis [20].
To estimate transcriptome-wide genetic correlations between JIA and the 23 traits, the TWAS results of these traits were retrieved from the TWAS-hub [4].Eighteen of the 23 traits were obtained from the TWAS-hub and TWAS analyses of T1D, AID ALL , and AID SURE traits were conducted using the same procedure as JIA.Transcriptome-wide genetic correlations between JIA, T1D, AID ALL , AID SURE , and the 18 traits were estimated using the RHOGE [39].

Weighted gene co-expression network analysis
This study used an Affymetrix microarray dataset (GEO study: GSE13501 [40]) and two RNA-seq datasets (GEO study: GSE112057 [41] and GSE79970 [42]) containing mRNA expression data on JIA patients and healthy controls.The preprocessing and normalization of datasets were performed in accordance with Kim et al. [2].The top 7000 most-expressed genes of the GSE13501 [40] dataset were selected for simplicity after normalization following Jung et al. [43].A signed weighted gene co-expression network analysis (WGCNA) was performed to identify co-expression modules comprising positively correlated genes based on bi-weight mid-correlation [44].A softthresholding power (β) of seven was selected for the network construction (scale-free r 2 = 0.8).The expression profile of each module was summarized by the module eigengene (ME).The minimum size of modules was 50 genes and paired modules with high ME (r > 85) were merged.With the co-expression modules, module preservation analyses were conducted using the microarray dataset as a reference set and each RNA-seq dataset as a test set with the co-expression modules [45].The analyses were permuted up to 1000 times and Z-summary scores were computed to identify the preserved modules.To examine whether TWAS associations were enriched in the co-expression modules, GSEA was performed with the fgsea R package using the co-expression modules as the reference gene sets [46].The TWAS associations from each panel, arranged by their Z-scores in descending order, were used as the pre-ranked gene sets.Functional annotation of the co-expression modules, in which TWAS associations were enriched, was carried out using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) [47].

JIA-relevant-tissue and cell-type analyses
LD score regression to specifically expressed genes (LDSC-SEG) v1.0.1 was applied to determine diseaserelevant tissues and cell types in JIA [48].Two types of precomputed expression datasets were downloaded for the analysis: (1) human RNA-seq data of 53 tissues/ cell types from the GTEx [49] and human/mouse/rat array data of 152 tissues/cell types from Franke lab [50,51] and (2) mouse array data of 292 immune cell types from ImmGen [52].Two types of precomputed chromatin datasets were also downloaded: (1) human 431 tissue-specific epigenomic annotations from peaks for six epigenetic marks from Roadmap Epigenomics [53] and ENCODE projects [54] and (2) human ATAC-seq peaks from 13 cell types for human hematopoietic hierarchy [55].Fetal data was excluded from the epigenomic data.The Bonferroni correction was applied to determine the significance levels.

Estimating the SNP-based heritability and expression-mediated heritability
We estimated SNP-based SNP heritability ( h 2 HESS ) in a set of 1702 independently partitioned genomic blocks [56] across the genome using Heritability Estimation from Summary Statistics (HESS) [39] v0.5.3-beta.To account for the large number of hypotheses tested, the Bonferroni correction was performed at α = 0.05/1702 to determine significant levels.The mediated expression score regression (MESC) [57] was used to estimate the proportion of heritability mediated by a cis-genetic component of gene expression levels ( h 2 med /h 2 g ).The MESC software was downloaded, along with its precomputed expression scores from the GTEx consortium and eQTLGen [58].

Genetic architecture analysis
To identify signatures of negative selection for JIA, we utilized summary-data-based BayesS (SBayesS) [59] in the GCTB software by estimating the joint posterior distribution of effect size and MAF.Based on the Markovchain Monte Carlo sample, the posterior mean was used as a point estimation, and the posterior standard error was approximated by the standard deviation.Moreover, a sparse LD matrix was used for computational efficiency [59].
The expression levels of HLA genes were estimated by seq2HLA [60] using the GSE112057 [41] dataset.FASTQ-formatted files in GSE112057 [41] were used as input files and the locus-specific expression levels of HLA genes were measured as read per kilobase million.HLA class I and II gene expression levels were compared between JIA and control groups.

Profiling adaptive immune repertoires in JIA
The unmapped reads were extracted from the GSE112057 [41] dataset following a read origin protocol (ROP) [70].SRR6868722 and SRR6868696 were excluded due to quality problems.To analyze the TCR repertoires, the reads mapped onto the complementary determining region 3 (CDR3) in TCR loci were identified using immune profiling by ROP (ImReP) [71].ImReP assembles the clonotypes, defined as clones having identical CDR3 amino-acid sequences, and identifies the corresponding V(D)J recombination.The alpha diversity (Shannon entropy) of TCRs was measured within the immune repertoire of an individual.The alpha diversities of TCRs (TCR α and β) were respectively compared between control and JIA groups.The same analyses were conducted using the GSE112957 [41] dataset grouped by JIA subtypes (healthy controls, oligo-JIA, poly-JIA, and sJIA).

Single-cell analysis
The single-cell RNA-seq (scRNA-seq) of JIA were downloaded from NCBI SRA (SRA study: SRP288574 [72]).The 10x Genomics BAM file contains single-cell transcriptome and TCR repertoire data from single CD4 + CD45RO + CD25 − (CD4 + ) and CD8 + CD45RO + (CD8 + ) T cells of synovial fluid (SF) and peripheral blood (PB) tissue in seven oligo-JIA patients.The BAM files mapped to human hg19 references were converted into FASTQ files using the bamtofastq (v1.3.1)tool and remapped to human GRCh38 references (ref-2020-A).The FASTQ data were processed using the cellranger [73] (v6.1.2) count tools and analyzed using the Seurat [74] R package.To preprocess the data, we filtered out cells with either >4000 or <200 distinct features and those with >10% mitochondrial count.We conducted normalization, identification of highly variable features (i.e., feature selection), and scaling data with default settings.After performing dimensional reduction using principal component analysis (PCA), we used the Harmony [75] R package to integrate datasets derived from seven patients and two tissue types.Based on the 40 components of Harmony, we carried out Uniform Manifold Approximation and Projection (UMAP) [76] and nearest-neighbor graph construction.Then, we determined single-cell clusters using 0.5 resolution.We retrieved the results of the single-cell TCR repertoire profiling from NCBI GEO (GEO study: GSE160097).

Statistical analysis
The Bonferroni correction was applied to determine the significance thresholds of TWAS associations (P TWAS < 7.55 × 10 −07 , 0.05/66,196) and PWAS associations (P PWAS < 2.16 × 10 −05 , 0.05/2311).The significance thresholds with the Bonferroni correction were also used in the LDSC-SEG analysis, heritability enrichment analysis, and genetic architecture analysis.A false discovery rate (FDR) was used to determine the significance threshold for the TWAS-GSEA, the MAGMA gene set analysis, the genetic correlation analyses, and the functional annotation analysis using DAVID.The consensus frequencies of HLA allele types were compared using two-sided 2-(control and JIA), 3-(control, African JIA, and European JIA), and 4-sample (control, oligo-JIA, poly-JIA, and sJIA) proportion tests, respectively.Using a two-tailed t-test, HLA class I and II gene expression levels were compared between control and JIA groups.The alpha diversities of TCRs were compared between control and JIA groups using a two-tailed t-test.When grouped by JIA subtypes (control, oligo-JIA, poly-JIA, and sJIA), the alpha diversities of TCRs were compared using one-way ANOVA with post hoc Tukey HSD.
To distinguish between genes likely causal for JIA risk from those that tag risk due to LD and shared regulatory features, we performed a probabilistic fine-mapping analysis by FOCUS [25] using the same expression weights used in our FUSION analyses.Among the 19 TWAS genes identified by FUSION, FOCUS identified nine genes in 90%-credible sets, which we denote as putatively causal genes affected by genome-wide significant GWAS signals (Table S5).Six of the nine (67%) genes had >0.4 PIPs, of which MAGI3 (PIP = 0.5), NFATC2IP (PIP = 0.417), and DCLRE1B (PIP = 0.405) were putatively responsible for JIA risk.Similarly, fine-mapping of PWAS results prioritized IL27 with a PIP of 0.98 in the plasma protein from the INTERVAL study (Table S5).ERAP2 from the ARIC study was included in credible sets with a PIP of 0.23.Altogether, we identified 19 susceptibility genes/ two risk proteins for JIA using TWAS/PWAS and prioritized likely causal genes for JIA risk, suggesting that our findings provide novel insights into the etiology of JIA.

Exploring biological pathways that may contribute to the pathogenesis of JIA
To explore the biological effects derived from overall TWAS associations from the multi-tissue panels, we conducted GSEA with the JIA TWAS results using CP gene sets [26].We found a total of nine CP gene sets were significantly involved with JIA TWAS associations (FDR < 0.05).Three of the nine gene sets are characterized by sulfation (Table 1).The impaired sulfation pathway was reportedly implicated in diastrophic dysplasia leading to cartilage disorder and joint degradations, similar to the clinical manifestations of JIA [82].In addition, tyrosine-sulfated proteins were reported to play roles in the pathogenesis of various AIDs [83].Four other gene sets involved in the IL27 pathway, IL6 family signaling, nitric oxide 2 IL12 (NO2IL12) pathway, and IL17 pathway are associated with immune responses that are representatives of AIDs.The T cell apoptosis pathway was also significantly implicated in JIA associated with the dysregulated T cell responses [84].Moreover, stathmin is known to play a critical role in regulating the cell cycle, and its phosphorylation is reportedly important in T cell activation [85,86].We additionally performed a GWAS-based pathway enrichment analysis using MAGMA [33] with JIA GWAS summary statistics and CP gene sets, given that HLA signals were dropped from TWAS-based results due to complicated LD.The results showed that a total of 40 gene sets were significantly implicated in JIA (FDR < 0.05; Table S6).Most pathways were associated with various immune system components, such as inflammatory cytokines, HLA, immunoglobulins, and T cells.The IL27 pathway, IL6 family signaling, NO2IL12 pathway, and IL17 pathway were observed in both TWAS-GSEA and MAGMA (Tables 1 and S6).The gene sets representing T1D and autoimmune thyroid disease (AITD) were also related to JIA, supporting the idea that JIA is indeed an AID.Collectively, our results suggest that impaired sulfation pathways and immune signaling, especially T cellmediated responses, may contribute to the pathogenesis mechanism of JIA.

Estimation of genetic correlations between JIA and other traits
Considering that the pathway analysis showed certain AIDs, such as T1D and AITD, were associated with JIA, we carried out genetic correlation analyses to further investigate the association of JIA with other AIDs.We estimated the genetic correlations between JIA and other traits categorized into two groups, AID-like and non-AID-like traits, at the genome-and transcriptomewide levels.At the genome-wide level, JIA showed significant positive correlations (FDR LDSC < 0.05) with eight out of the ten (80%) AIDs including systemic lupus erythematosus (SLE) (r = 0.66 and FDR LDSC = 1.24 × 10 −08 ) and exhibited the most significant correlation with the AID ALL trait (r = 0.53 and FDR LDSC = 7.19 × 10 −11 ) (Fig. S3A).At the transcriptome-wide level, we found significant genetic overlaps (FDR RHOGE < 0.05) between JIA and seven AID traits including ulcerative colitis (UC), AID ALL trait, Crohn's disease, rheumatoid arthritis (RA), SLE, primary biliary cirrhosis, and AID SURE trait.JIA was most significantly correlated with UC (r = 0.35 and FDR RHOGE = 1.28 × 10 −06 ) and had the most positive correlation with T1D (r = 0.65 and FDR RHOGE = 6.91 × 10 −02 ) (Fig. S3B).In particular, T1D was simultaneously identified to have highly positive correlations with JIA at the genome-(r = 0.62 and FDR LDSC = 3.58 × 10 −05 ) and transcriptome-wide levels (r = 0.65 and FDR RHOGE = 6.91 × 10 −02 ).Overall, these results support the existence of shared genetic contributions to JIA and AIDs at the genome-and transcriptome-wide levels.

Disease heritability in JIA-relevant tissues and cell types
To better understand how genetic variants affect JIA risks, we aimed to identify the cell types or tissues relevant for the pathogenesis of JIA using LD score regression in specifically expressed genes (LDSC-SEG) [48].LDSC-SEG tests whether disease heritability is enriched in regions of genes in a specific tissue using stratified LD score regression [87], analyzing transcriptome or epigenome data together with GWAS.We first applied this analysis to 53 and 152 tissues or cell types from the GTEx project [49] and Franke lab data [50,51], respectively.Consistent with the reason for using connective tissues in the JIA TWAS, this analysis showed that lymphocytes or blood tissues were enriched in JIA (Fig. S4).Additionally, we detected enrichment for the CD4 + T cells among 292 immune cell types from ImmGen [52] in JIA (Fig. S5).
To support the results from the expression-based LDSC-SEG analysis, we examined whether JIA-related heritability is enriched in epigenetic markers from the ENCODE projects [54] and Roadmap Epigenomics [53].Based on 431 tissue-specific ChIP-seq annotations of six epigenetic marks, we detected an enrichment at the 5% Bonferroni threshold (P < 1.16 × 10 −04 ) for only blood tissue (Fig. S6).Notably, the most significant enrichment was observed in T cells across active promoters and gene markers (Fig. 2A) as well as active enhancer markers (Fig. 2B).To validate the Chip-seq results, we used ATAC-seq data which is associated with chromatin accessibility in 13 blood cell types 55 .In line with the Chip-seq results, we found enrichment in T, B, and NK cells after the Bonferroni correction (P < 3.8 × 10 −03 ; Fig. 2C).
To test whether these enriched tissues and cell types causally mediate JIA risk, we next estimated the proportion of heritability mediated by gene expression levels ( h 2 med /h 2 g ) in a tissue context using mediated expression score regression (MESC) [57].When using the all-tissue meta-analyzed expression scores from GTEx, we observed 0.34 of h 2 med /h 2 g for JIA (standard error (se) = 0.16), which was higher than those from other AIDs (P < 3.37 × 10 −02 ; Fig. 2D).We estimated lower h 2 med /h 2 g from the tissue-group meta-analyzed (i.e., 12 connective tissues) ( h 2 med /h 2 g = 0.217 and se = 0.119) and individual-tissue (i.e., whole blood) expression scores ( h 2 med /h 2 g = 0.137 and se = 0.085) than from all-tissue expression scores in GTEx data (Fig. 2D), consistent with those in previous studies using 42 diseases and complex traits 57 .We identified the highest h 2 med /h 2 g value in the EBV-transformed lymphocytes among 48 GTEx tissues ( h 2 med /h 2 g = 0.189 and se = 0.067; Fig. S7), and validated the results by whole blood data from eQTLGen (Fig. 2E) [58].Together, our findings strongly suggest that the SNP-based heritability of JIA was closely associated with gene expression and active epigenetic markers in blood tissue, especially in T lymphocytes.

Polygenicity in the genetic architecture of JIA
In complex and polygenic traits, dissecting joint distribution of effect size and MAF is important to understand the genetic architecture and to detect signals of natural selection [88].Deleterious mutations to fitness are selected against and maintained at a low frequency by negative (purifying) selection [89,90].Moreover, the negative selection in autoimmune disease is important because it provides insight into the evolution of the human immune system.We conducted genetic architecture analysis using SBayesS, a recently developed method based on the Bayesian mixed linear model with GWAS summary statistics [88] to estimate the relationship between SNP effect size and MAF ( S ).The SBayesS method also allows us to infer multiple genetic architecture parameters including the SNP-based heritability ( h 2 SBayesS ) and polygenicity ( π ) which is the proportion of SNPs with nonzero effects.A negative value of S indi- cates that SNPs with lower MAF are prone to having larger effects, consistent with a model of negative selection.Overall, we estimated S = −0.96(posterior stand- ard error (p.s.e) = 0.10), providing evidence that the genetic variants related to JIA have been under negative selection (Fig. 3A).Excluding SNPs in the HLA region (chr6:28-34Mb) increased the estimate S in JIA ( S = −0.49and p.s.e = 0.2), which suggests that the SNPs in the HLA region contributing toward JIA risk may have been under negative selection.The previous studies showed that the majority of AIDs have genetic relationships with the HLA area, and numerous distinct HLA alleles can predispose people to AIDs [91,92].For the SNP-based heritability, the estimate of h 2 SBayesS for JIA was 0.47 (p.s.e = 0.03), which was higher when compared with other AIDs (Fig. 3A).We also observed that excluding the HLA region reduced the h 2 SBayesS to 0.34 in JIA, which is consistent with previous studies in AIDs [3,93].Importantly, the polygenicity increased to 8.57% after excluding the HLA region, although the estimated polygenicity ( π ) is about 0.09% in JIA.
Moreover, we estimated the SNP heritability ( h 2 HESS ) using HESS [39] in each of 1703 independently partitioned genomic LD blocks across the genome.The results revealed that the total h 2 HESS was 0.28 in JIA (Fig. 3B), slightly lower than that based on SBayesS ( h 2 SBayesS ) (Fig. 3A), and the HLA region significantly explained a total of 17.5% heritability in JIA (Bonferroni-corrected P < 2.94 × 10 −05 ; Fig. 3B).In line with the h 2 SBayesS results, the total h 2 HESS in JIA was higher than the other AIDs.Aside from the HLA region, most of the SNP-based heritability was distributed uniformly across the genome in JIA and other AIDs.Collectively, these results strongly suggest that JIA shows higher polygenicity than other AIDs in both HLA regions and outside of HLA regions, and SNP-based heritability comes from a vastly polygenic background.

Estimation of the consensus frequencies of HLA allele types and HLA gene expression
The HLA genes in the HLA region on chromosome 6p21 encode several essential proteins in the immune system [94].Consistent with the SNP-based heritability results in the JIA (Fig. 3B), variants in the HLA regions explain more heritability than many other variants for many diseases [95][96][97].Identifying HLA allele type is essential to better understanding the disease etiology, and many tools have been developed for HLA typing using NGS data [42,[60][61][62][63][64][65][66].To explore the association between HLA type diversity and JIA risk, we estimated consensus HLA allele type frequencies at 12 major loci using seven HLA-typing software (see "Methods").We focused on the allele types with a significantly different distribution between healthy subjects (n = 12) and JIA patients (n = 115) (P < 0.05; Table S10).At HLA class I loci, the frequency of A*03:01 was approximately tripled in JIA patients (12.89%) compared with that in healthy subjects  S11), the frequencies of DRB1*04:01 and DRB3*02:01 were the highest in poly-JIA patients.Additionally, for the transcriptomic analysis of HLA regions, we estimated the locus-specific expression levels of nine HLA genes in JIA patients and healthy controls.The results showed that the expression levels of HLA-DPA1, DPB1, DQB1, and DRA were significantly lower in JIA patients than in healthy controls (P < 0.05; Fig. S10).HLA class II is known to be strongly associated with susceptibility to many AIDs including JIA [98][99][100].Although various mechanisms could induce the downregulation of HLA class II gene expression, the lowered expression could result in diminished tolerance induction of self-reactive T cells, leading to AIDs in the end [101].
In the HLA region, haplotypes are specific to an individual ancestral population due to the population-specific positive selection [102].We calculated the consensus frequencies of HLA types of JIA in African and European ancestry to investigate whether heterogeneity is derived from different ancestry in HLA allele types for JIA.We focused on the allele types with significantly different distributions between healthy controls (n = 12), African JIA (n = 34), and European JIA patients (n = 81) (P < 0.05; Table S12).At HLA class I loci, B*45:01 was not detected in healthy controls and was more frequent in African JIA patients (11.18%) than European JIA patients (0.62%) (P = 8.33 × 10 −22 ; Fig. 4A).B*40:01, which was not observed in African patients, had higher frequencies in European patients (10.65%) than in healthy controls (5.36%) (P = 4.90 × 10 −02 ).At HLA class II loci, the frequencies of DQA1*01:01 and DQA1*03:01 were increased by more than fourfold in the European JIA patients (13.68% and 12.84%) compared to in the African JIA patients (3.05% and 1.78%), respectively (P DQA1*01:01 = 9.65 × 10 −08 , P DQA1*03:01 = 2.56 × 10 −09 ; Fig. 4B).The frequencies of DRB1*04:01 observed only in JIA patients were higher in European patients (15.10%) than African patients (2.49%) (P = 1.02 × 10 −09 ; Fig. 4B).Collectively, our results showed that HLA allele types had significantly imbalanced distributions between the JIA and control groups as well as between African and European ancestry, which suggests that they may be involved in JIA etiologies.

The T cell receptor repertoire reveals clonal relationships between different subpopulations
Along with HLA molecules, antigen-experienced memory T cells have been implicated as critical drivers of autoimmune inflammation [103][104][105][106]. Consistent with these previous studies, we suggested that T cells were associated with JIA etiology using TWAS and JIA SNPheritability analysis.Therefore, we hypothesize that the TCR repertoire may be involved in JIA etiology because the TCRs mediate the recognition of HLA and provide critical insights into the adaptive immune response in health and disease [107].To dissect whether the TCR repertoire is related to JIA, we estimated the locus-specific alpha diversities of TCR CDR3 reads using the total number of distinct clonotypes and their relative frequencies at the bulk RNA-seq level (GSE112057 [41]).We observed alpha diversities of TCR α and β were significantly decreased in JIA patients compared with controls (P TCRα = 9.60 × 10 −04 and P TCRβ = 1.36 × 10 −02 ), which indicates that the clonotypic diversities of TCRs were reduced in the JIA group (Fig. 5A).When JIA was grouped into three subtypes (oligo-JIA, poly-JIA, and sJIA), the patterns of alpha diversity in all subtypes were consistent with those in the JIA group (Fig. S11 and Table S13).These results suggest that a few clonotypes positively affecting the development of JIA may be dominant in the immune repertoire of JIA patients.
Next, we investigated whether different T cell populations showed differences in TCR diversity at the singlecell level.Single-cell TCR sequencing can accurately measure the diversity of T cell populations, which is critical for understanding the complexity of the immune response by providing paired TCR α and β information [108].We used scRNA-seq data (SRP288574 [72]) derived from single CD4 + CD45RO + CD25 − (CD4 + ) and CD8 + CD45RO + (CD8 + ) T cells in SF and PB tissues of seven oligo-JIA patients.After a series of quality control filters (see "Methods"), nine CD4 + and seven CD8 + T cell clusters were identified (Fig. S12, S13, and 5B), and each cluster represented a distinct distribution of clonotypes (Fig. 5C).Among 67,235 single T cells, 33,855 cells (50.3%) had at least one pair of full-length TCR α and β chains.In addition, 13,113 of the 33,855 cells (38.7%) expressed a full-length α-β chain pair detected at least twice.Expanded clonotype cells were most prevalent in two CD8 + clusters (CD8_C1 and CD8_C3) and one CD4 + cluster (CD4_C5) (Fig. 5C).Compared with the other clusters, the CD8_C1 cluster had higher expression of several markers of recently activated effector memory or effector T cells (designated as CD8 + GZMH + T EMRA cells) and the CD8_C3 cluster had higher expression of several markers of effector memory T cells (designated as CD8 + GZMK + T EM cells) (Table S14).The CD4_C5 cluster showed higher expression of CXCL13 and BHLHE40 (designated as CXCL13 + BHLHE40 + T H cells) (Fig. S14 and Table S14).Each cluster of the CD8 + GZMH + T EMRA , CD8 + GZMK + T EM , and CXCL13 + BHLHE40 + T H cells was observed to have a substantially lower alpha diversity value than most other clusters (Fig. 5D).Additionally, the single T cell analysis by RNA-seq and TCR tracking expansion (STAR TRA C-expa) index, which quantitatively describes tissue clonal expansion, revealed the CD8 + GZMH + T EMRA , CD8 + GZMK + T EM , and CXCL13 + BHLHE40 + T H cells as the clusters with the highest degree of clonal expansion for each cell type (Fig. S15).
To emphasize disease-specific memory T cell clusters, we conducted GSEA using the co-expression modules from case-control JIA expression data as reference gene sets (Fig. S16).The results showed that only the CXCL13 + BHLHE40 + T H cells were positively enriched in the black module gene set (NES = 1.93 and FDR = 1.20 × 10 −6 ; Fig. 5E).Notably, the black module genes, associated with leukocyte migration essential for inflammation and innate immunity [109] (Table S9), were positively enriched in TWAS associations from NTR blood tissue (Fig. S2C).In summary, we identified clonally expanded T cell subpopulations in JIA patients.The CXCL13 + BHLHE40 + T H cells were significantly related to case-control JIA expression data, suggesting that the cells might be potential therapeutic targets for JIA.

Discussion
Using a TWAS/PWAS to identify biologically interpretable susceptibility genes/proteins, we detected 19 TWAS genes and two PWAS proteins significantly associated with JIA risks (Fig. 1).Since trait-associated genes/ proteins identified by TWAS/PWAS do not fully elucidate the disease's causality [110], we performed a finemapping analysis to prioritize putatively causal genes/ proteins for JIA.We found that MAGI3, DCLRE1B, NFATC2IP, IL27, and ERAP2 were responsible for JIA risks and are implicated in the immune system.MAGI3 encodes the PDZ proteins involved in T cell homeostasis and mediates the suppression of the PI3K/Akt pathway [111,112].The downregulation of MAGI3 (Z TWAS = −5.18;Table S3) may lead to an upregulation of the PI3K/Akt pathway, which, in turn, downregulates the differentiation of T cells toward Treg [10,112].DCLRE1B was reportedly implicated in an inherited bone marrow failure syndrome associated with immune deficiency [113].NFATC2IP regulates the nuclear factor of activated T cells (NFAT)-driven transcription of specific cytokine genes, including IL4 in T-helper 2 cells [114].Considering that IL4 production is affected by IL27 [115], NFATC2IP may be linked to the inflammatory cytokine pathways identified by the TWAS-GSEA (Table 1).Consistent with our PWAS result of IL27 (Z PWAS = −5.19;Table S4), the synovial fluid level of IL27 was reported to be significantly decreased in enthesitis-related arthritis (ERA) patients, one of the JIA subtypes [116].Additionally, ERAP2 (Z PWAS = 4.57; Table S4) reportedly has crucial roles in immunomodulating immune responses [117].The SNP in a splice site for ERAP2 had a genomewide significant association with JIA [99], and polymorphism in genes encoding ERAP1 and ERAP2 is known to predispose to ERA [118].While our TWAS genes, MAGI3 and NFATC2IP, were mentioned in previous JIA TWAS studies [81,119,120], we could better understand the genetic contribution of risk genes to JIA pathogenesis by conducting a PWAS together with fine-mapping and TWAS-based pathway enrichment analysis.A proteome-level analysis can provide more relevant biological information, capture alternative splicing, and post-translational modifications about disease mechanisms.
In genetic architecture analysis, SNP-based heritability was spread uniformly throughout the genome aside from a modest fraction in the HLA regions (about 18%) (Fig. 3).The results can be explained by an "omnigenic model".The omnigenic model is a theoretical framework in genetics and genomics that proposes that most traits, diseases, and other complex phenotypes are influenced by the combined effects of a large number of genes, rather than being driven by just a few "core" genes [121,122].
Alongside the TWAS/PWAS and genetic architecture analysis, we conducted HLA-typing analyses as per the disease state and ancestry.In the investigation of JIA susceptibility, HLA typing is an indispensable tool for pinpointing genes and proteins linked to the disease, becoming essential in immunogenetics research [123].Different human populations exhibit different genetic architecture due to diverse LD patterns, and studying multiple ancestries allows for a more comprehensive understanding of HLA variation.Additionally, the genes encoding HLA are characterized by a remarkable degree of polymorphism, meaning they exist in numerous different HLA alleles.The prevalence of these alleles varies considerably across ethnic groups [124,125].Understanding HLA typing among various ancestries can have important clinical consequences, opening doors to more precise medical diagnoses and targeted therapeutic interventions.As the rigorous HLA analysis using seven HLA-typing software, we suggested DRB1*04:01 and DRB3*02:01 showing >10% consensus frequencies in only JIA patients as risk alleles for JIA.In line with our result, it was reported that DRB1*01 and DRB1*04 might be implicated in the genetic predisposition of rheumatoid factor+ JIA and that DRB1*04 was confirmed to be involved in sJIA [126].In addition, DRB1*04:01 and DQB1*03:02 may have a shared contribution to JIA and T1D since specific interactions between DRB1*03:01-DQB1*02:01/DRB1*04:01-DQB1*03:02 genotypes were previously described to increase T1D risks [127].As The colors correspond to the number of clones (i.e.clonal abundance).D A scatter plot showing the alpha diversity of TCR in each cluster.The red and blue colors denote CD4 + and CD8 + T cells, respectively.E Module enrichment analysis between expression levels of scRNA-seq CD4_C5 cluster and black co-expression gene set derived from JIA case-control expression data (GSE13501).The gene list of the black modules is in Table S9.
(See figure on next page.)the distribution of HLA alleles varies among different ancestries, multi-population needs to be considered an important factor in studying the associations between HLA allele types and disease risks [128].Through HLAtyping analysis utilizing JIA patients' multi-ancestry information, we identified ancestry-specific risk alleles in  4).Even though some HLA types were previously reported as risk alleles for JIA in specific populations, few HLA studies have compared risk alleles for JIA between different ancestral groups [123,129].
A recent study showed that there is a hypothesis that HLA risk alleles may affect the risks of autoimmunity by influencing thymic T cell selection [130].T cells with receptors that recognize self-HLA molecules and interact with foreign antigens are selected during T cell development in the thymus [131].In AIDs, self-antigens may induce immune responses by self-reactive T cells similar to how foreign antigens trigger immune responses [132].As the proliferation of antigen-specific lymphocytes is induced by immune responses, we believe that the reduced alpha diversities of TCRs within JIA patients may be attributed to the selective increase of a few different self-reactive clonotypes at the levels of individual patients (Fig. 5).In fact, circulating CD4 + T cells replicating the phenotypical signature of T lymphocytes infiltrating the inflamed synovium were increased in patients with JIA [133].Previous clinical studies also reported that the alpha diversity of the TCR repertoire was significantly reduced in patients with RA or SLE that are also AIDs [134][135][136].At the single-cell level, our results showed that clonally expanded T cell subpopulations in JIA patients, especially CXCL13 + BHLHE40 + T H cells, were significantly involved in JIA risks (Fig. 5).The PD-1 + TOX + BHLHE40 + population of CD4 + T cells was reported to presumably support extrafollicular B cell activation by secreting IL21 and CXCL13 in JIA [72].In addition, it was reported that CXCL13-producing CD4 + T H cells induced in RA synovium may be involved in the recruitment of B cells and circulating follicular helper T cells at inflammation sites [137].
Although our study successfully identified susceptibility genes/proteins for JIA by TWAS/PWAS, functional studies are needed to clarify the exact genetic effects derived from the genes/proteins.We excepted genes from the HLA region due to its structural diversity and long-range LD in the TWAS/PWAS; however, the GWAS trait associations have been more reported in the HLA region than in any other locus [11].To compensate for this situation, we identified the consensus HLA allele types using seven different HLA-typing tools with RNA-seq data.The genetic architecture analysis, especially for SNP-based heritability, requires additional confirmation because we estimated the heritability using GWAS summary statistics data with reference LD dataset.In addition, the publicly available data we used had some limitations.While it is crucial to study the pathological differences between JIA subtypes, the GWAS summary statistics data for JIA comprising 8 subtypes was used because there were very few publicly available GWAS data for each JIA subtype (See "Methods").The scRNA-seq dataset was derived solely from oligo-JIA patients.Since the RNA-seq dataset for HLA typing did not contain ancestral information on healthy subjects, the ancestry-specific risks of HLA alleles need to be further confirmed in the control group.Additionally, most eQTL/pQTL datasets for TWAS and PWAS, along with GWAS summary statistics for JIA, are primarily focused on individuals of European descent.This creates a critical gap in our understanding of complex traits across diverse populations [138].Fortunately, recent advancements in comprehensive public resources have enabled researchers to employ multi-ancestry approaches for TWAS/PWAS [23,139], providing a way for more inclusive and robust genetic analyses for JIA.

Conclusions
Our findings shed new light on the pathogenesis of JIA and provide a strong foundation for future mechanistic studies aimed at uncovering the molecular drivers of JIA.

Figure S2. Functional annotation of co-expression modules enriched with TWAS associations. (A)
A GSEA plot using a ranked gene list from GTEx: Adipose Visceral Omentum and magenta module (normalized enrichment score (NES) = −1.93 and FDR = 0.025) (B) A GSEA plot using a ranked gene list from GTEx: Artery Aorta and magenta module (NES = −1.93,and FDR = 0.024).(C) A GSEA plot using a ranked gene list from NTR: Blood and black module (NES = 1.84, and FDR = 0.024).(D) A GSEA plot using a ranked gene list from GTEx: Muscle Skeletal and yellow module (NES = 1.43, and FDR = 0.049).Supplementary Figure S3.Genetic correlations between JIA and other traits at the genome-and transcriptome-wide levels.AID-like and non-AID-like traits were compared with JIA.The range of genetic correlation coefficient and FDR values are represented by a color bar and symbol size, respectively.The shape of the symbol indicates the significance of the corresponding correlation.(A) Genetic correlations between JIA and other traits at the genome-wide level.(B) Genetic correlations between JIA and other traits at the transcriptome-wide level, based on the TWAS results.S stands for significant and NS for non-significant.Supplementary Figure S4.Heritability enrichment analysis using tissue or cell type expression.Linkage disequilibrium (LD) score regression in specifically expressed genes (LD-SEG) analysis applied to JIA GWAS data.(A) The enrichment results of LD-SEG analysis using GTEx data.(B) The enrichment results of LD-SEG analysis using Franke lab data.The red dotted lines represent a significant threshold based on the Bonferroni correction.P-values of 9.43 × 10 −04 (0.05/53) and 3.28 × 10 −04 (0.05/152) are the significant thresholds of the results from GTEx and Franke lab data, respectively.Supplementary Figure S5.Heritability enrichment analysis using immune cell type expression.The enrichment results of LD-SEG analysis using ImmGen datasets.The black dotted line represents a significant threshold based on the Bonferroni-corrected P < 1.71 × 10 −04 (0.05/292).Supplementary Figure S6.Heritability enrichment analysis using epigenetic markers.The enrichment results of LD-SEG analysis using epigenetic markers of different tissue types.The red dotted lines represent a significant threshold based on the Bonferroni-corrected P-value< 1.   S14.Expression levels of signature genes in each cluster.(A) UMAP plot showing the expression levels of four selected genes.Each hexagon represents summarizing points into binned hexagon cells (The number of bins partitioning the range = 50) using schex R package.The color bar denotes the proportion of observations in the bin greater than 0. (B) Dot plot representing the expression levels of six selected genes.The size of dots denotes the percentage of cells within a cluster and the color bar encodes the average expression level of the selected genes across all cells within a cluster.Supplementary Figure S15.Clonal expansion levels of the clusters.Clonal expansion levels of each cluster in (A) CD4 + T cells and (B) CD8 + T cells.STAR TRA C-expa quantified clonal expansion levels for each JIA patient (n = 7).*FDR < 0.05, **FDR < 0.01, two-sided Wilcoxon test.Supplementary Figure S16.Module enrichment analysis between expression levels of scRNA-seq CD4_C5 cluster genes and co-expression gene sets derived from JIA case-control expression data.GSEA plots between expression levels of scRNA-seq CD4_C5 cluster and co-expression of (A) Tan and (B) Green modules derived from case-control expression data of GSE13501.The gene lists of modules are in Table S7.
Additional file 2: Table S1.The list of 12 reference eQTL panels used in the TWAS for JIA.Table S2.The list of the total TWAS associations for JIA from the connective tissues.Table S3.The list of 35 significant TWAS associations for JIA (PTWAS < 7.55 × 10−07).Table S4.The list of the total PWAS associations for JIA.Asterisks represent significant PWAS associations after Bonferroni correction (PPWAS < 2.16 × 10−05).Table S5.The result of fine-mapping of TWAS/PWAS signals by using the FOCUS.Table S6.Biological pathways significantly involved in JIA, identified by the MAGMA (FDR < 0.05).Table S7.The list of Entrez IDs of genes in 11 coexpression modules identified by WGCNA.Table S8.The result of gene set enrichment analysis with TWAS associations using co-expression modules as reference gene sets.Table S9.Functional annotation of the 11 modules with GO terms using DAVID, respectively.Table S10.HLA allele types having significantly imbalanced distribution between healthy controls and JIA patients (P < 0.05).Table S11.HLA allele types having significantly imbalanced distribution between healthy controls, oligo-JIA, poly-JIA, and sJIA patients (P < 0.05).Table S12.HLA allele types having significantly imbalanced distribution between healthy controls, African JIA, and European JIA patients (P < 0.05).Table S13.The P-values calculated by comparing the alpha diversity of TCRs.Table S14.Expression profiles of gene markers in each T cell cluster.

Fig. 1
Fig. 1 Manhattan plots for JIA TWAS/PWAS result.The upper and lower panels show Manhattan plots for JIA TWAS and PWAS results, respectively.Each dot represents a P-value for a TWAS or PWAS association between JIA and the predicted expression level of a gene or cis-regulated plasma protein.The black and green horizontal dashed lines indicate the significance thresholds of TWAS and PWAS with Bonferroni correction (P TWAS < 7.55 × 10 −07 and P PWAS < 2.16 × 10 −05 ).Statistically significant TWAS associations of JIA are colored by tissue panels.The black dots of the lower panel indicate statistically significant PWAS associations of JIA.The names of TWAS genes and PWAS proteins with PIP > 0.2 in FOCUS are labeled

Fig. 3
Fig. 3 Genetic architecture of JIA.A Estimation of the three genetic architecture parameters for JIA, AID ALL , and AID SURE using Summary-data-based BayesS (SBayesS).The dots and horizontal bars represent the posterior means and standard errors, respectively.The red and orange colors represent SBayesS and SBayesS excluding HLA regions.B Cumulative local SNP heritability across the genome using heritability estimates from Summary statistics (HESS).Total SNP-based genes are indicated.The red color shows SNP heritability explained by the HLA region

Fig. 4
Fig. 4 Consensus frequencies of HLA allele types in healthy controls, African JIA patients, and European JIA patients.A Bar plots showing the consensus frequencies of HLA allele types at HLA class I loci.B Bar plots showing the consensus frequencies of HLA allele types at HLA class II loci.The x-and y-axis indicate the names and frequencies of HLA allele types, respectively.The names of HLA types having significantly different distributions between healthy controls, African JIA patients, and European JIA patients were represented in bold (P < 0.05)

Fig. 5
Fig. 5 TCR diversities in JIA.A Box plots showing the alpha diversities of TCR α and β in healthy control and JIA groups.The y-axis indicates alpha diversity representing the clonotypic diversity of specific TCR locus.Green and red dots indicate samples of healthy controls and JIA patients, respectively.Asterisks denote the significance levels of differences between the clonotypic diversities of TCRs within healthy controls and that within JIA patients.*, P < 0.05; **, P < 0.01; ***, P < 0.001.B A Uniform Manifold Approximation and Projection (UMAP) plot of 67,235 T cells from seven JIA patients showing 16 major clusters (nine for 34,605 CD4 + and seven for 32,630 CD8 + T cells).Each dot represents an individual T cell and color indicates cluster origin.C Left.A bar plot showing the number of single T cells and frequencies of unique and expanded T cell clones in each cluster.The inset numbers indicate the proportion of the cell type in the cluster.Right.UMAP plot showing the clusters with the clone types.The colors correspond to the number of clones (i.e.clonal abundance).D A scatter plot showing the alpha diversity of TCR in each cluster.The red and blue colors denote CD4 + and CD8 + T cells, respectively.E Module enrichment analysis between expression levels of scRNA-seq CD4_C5 cluster and black co-expression gene set derived from JIA case-control expression data (GSE13501).The gene list of the black modules is in TableS9.

Fig. 5 (
Fig. 5 (See legend on previous page.) Figure S2.Functional annotation of co-expression modules enriched with TWAS associations.(A) A GSEA plot using a ranked gene list from GTEx: Adipose Visceral Omentum and magenta module (normalized enrichment score (NES) = −1.93 and FDR = 0.025) (B) A GSEA plot using a ranked gene list from GTEx: Artery Aorta and magenta module (NES = −1.93,and FDR = 0.024).(C) A GSEA plot using a ranked gene list from NTR: Blood and black module (NES = 1.84, and FDR = 0.024).(D) A GSEA plot using a ranked gene list from GTEx: Muscle Skeletal and yellow module (NES = 1.43, and FDR = 0.049).Supplementary FigureS3.Genetic correlations between JIA and other traits at the genome-and transcriptome-wide levels.AID-like and non-AID-like traits were compared with JIA.The range of genetic correlation coefficient and FDR values are represented by a color bar and symbol size, respectively.The shape of the symbol indicates the significance of the corresponding correlation.(A) Genetic correlations between JIA and other traits at the genome-wide level.(B) Genetic correlations between JIA and other traits at the transcriptome-wide level, based on the TWAS results.S stands for significant and NS for non-significant.Supplementary FigureS4.Heritability enrichment analysis using tissue or cell type expression.Linkage disequilibrium (LD) score regression in specifically expressed genes (LD-SEG) analysis applied to JIA GWAS data.(A) The enrichment results of LD-SEG analysis using GTEx data.(B) The enrichment results of LD-SEG analysis using Franke lab data.The red dotted lines represent a significant threshold based on the Bonferroni correction.P-values of 9.43 × 10 −04 (0.05/53) and 3.28 × 10 −04 (0.05/152) are the significant thresholds of the results from GTEx and Franke lab data, respectively.Supplementary FigureS5.Heritability enrichment analysis using immune cell type expression.The enrichment results of LD-SEG analysis using ImmGen datasets.The black dotted line represents a significant threshold based on the Bonferroni-corrected P < 1.71 × 10 −04 (0.05/292).Supplementary FigureS6.Heritability enrichment analysis using epigenetic markers.The enrichment results of LD-SEG analysis using epigenetic markers of different tissue types.The red dotted lines represent a significant threshold based on the Bonferroni-corrected P-value< 1.16 × 10 −04 (0.05/431).Supplementary Figure S7.Estimation of the proportion of heritability mediated by the gene expression levels ().The bar plots show the results of mediated expression score regression (MESC) using GTEx v8 and eQTLGen data.Error bars indicate jackknife standard errors.Supplementary Figure S8.Consensus frequencies of HLA allele types in healthy control subjects and JIA patients.(A) Bar plots showing the consensus frequencies of HLA allele types at HLA class I loci.(B) Bar plots showing the consensus frequencies of HLA allele types at HLA class II loci.The x-and y-axis indicate the names and frequencies of HLA allele types, respectively.The names of HLA types detected only in the JIA group are marked in red.The names of significantly different HLA types observed in the healthy control and JIA groups are represented in bold (P < 0.05).Supplementary Figure S9.Consensus frequencies of HLA allele types in healthy control subjects and JIA patients grouped by subtypes.(A) Bar plots showing the consensus frequencies of HLA allele types at HLA class I loci.(B) Bar plots showing the consensus frequencies of HLA allele types at HLA class II loci.The x-and y-axis indicate the names and frequencies of HLA allele types, respectively.The names of significantly different HLA types observed in the healthy control and 3 JIA subtype groups are represented in bold (P < 0.05).Supplementary Figure S10.Boxplots showing the locus-specific expression levels of HLA class II genes.Boxplots showing the expression levels of HLA class II genes, (A) DPA1,(B) DPB1, (C) DQB1, and (D) DRA, in healthy controls and JIA patients.Asterisks represent the significance levels of difference between the HLA gene expression levels in JIA patients and those in healthy controls.*, P < 0.05; **, P< 0.01.Supplementary Figure S11.The clonotypic diversities of TCRA and TCRB loci in healthy controls and patients of 3 JIA subtypes.Box plots showing the alpha diversities of (A) TCRA and (B) TCRB loci in the healthy controls and 3 JIA subtypes.The y-axis indicates alpha diversity representing the clonotypic diversity.Colored dots indicate samples of healthy controls and JIA patients of specific subtypes, respectively.Asterisks represent the significance levels of differences between the clonotypic diversities of TCRA and TCRB in healthy controls and those in patients of each JIA subtype.Supplementary Figure S12.Quantification of the cluster distribution.

(
A) Bar plot showing cluster distribution of the two different T cell types.(B) Bar plot showing cluster distribution of the seven different JIA patients.(C) Bar plot showing cluster distribution of the two tissue types.The y-axis of the bar graph denotes the normalized proportions.(D) Alluvial diagram showing the distribution of T cell types.Three categorical axes are variables (i.e., origin tissues, clusters, and T cell types) along which the data are grouped.Each horizontal spline (called alluvium) corresponds to a fixed value of each axis variable, indicated by T cell types' of colors.The red and blue colors correspond to CD8 + and CD4 + T cells, respectively.Supplementary Figure S13.UMAP plot for T cell types and tissue types in different clusters.Each dot represents an individual T cell and color indicates cluster origin.Supplementary Figure

Table 1
Biological pathways significantly involved in JIA based on the TWAS associations TWAS associations from the NTR blood panel and the GTEx skeletal muscle panel were significantly involved in black (NES = 1.84 and FDR = 2.40 × 10 −02 ) and yellow (NES = 1.43 and FDR = 4.90 × 10 −02